Advice-Taking Meta-Analysis

Summer 2025 Report

Author

Jessica Helmer

Published

August 20, 2025

Overview

I have been working on extending the Himmelstein (2022) advice-taking dual-hurdle model from a two-level cross-classified model, in which responses are cross-classified within judges and items, to a three-level cross-classified model, in which responses are cross-classified within judges and items and nested within studies. To do this, I modified the Stan code and its R implementation to add a third level to the multilevel structure, accounting for study-level variance in the intercepts of both parts of the model: the part that estimates the probabilities of Decline, Adopt, and Compromise (DAC) and the part that estimates the continuous Weight of Advice (WOA) values. I then fit the model to the full dataset of WOA studies.
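The dual-hurdle structure can be summarized as: each response first falls into one of the three DAC categories via a categorical-logit model (with Compromise as the reference category), and only Compromise responses additionally contribute a beta likelihood for the observed WOA. A minimal sketch of the per-observation log-likelihood, in illustrative Python rather than the analysis code itself:

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def dac_hurdle_loglik(y1, y2, eta1, eta2, mu, phi):
    """Log-likelihood of one response under the dual-hurdle DAC model.

    y1: DAC category (Compromise = 1, Decline = 2, Adopt = 3)
    y2: WOA value in (0, 1), used only for Compromise responses
    eta1, eta2: linear predictors for Decline and Adopt
        (Compromise is the reference category with linear predictor 0)
    mu, phi: logit-mean and log-precision predictors for the beta part
    """
    # categorical-logit probabilities with Compromise as reference
    denom = 1.0 + math.exp(eta1) + math.exp(eta2)
    log_p = {1: -math.log(denom),
             2: eta1 - math.log(denom),
             3: eta2 - math.log(denom)}
    ll = log_p[y1]
    if y1 == 1:  # Compromise: add the beta density of the observed WOA
        a = inv_logit(mu) * math.exp(phi)
        b = (1.0 - inv_logit(mu)) * math.exp(phi)
        ll += ((a - 1.0) * math.log(y2) + (b - 1.0) * math.log(1.0 - y2)
               - (math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)))
    return ll
```

The study-level extension described above adds one more group-level term to each of the four linear predictors (eta1, eta2, mu, phi); it does not change this likelihood structure.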

Code

Stan Model Code
data {
  
  int<lower = 1> N;  // number of observations
  int<lower = 2> ncat;  // number of DAC categories
  int Y1[N];  // categorical response variable for hurdle (Compromise = 1, Decline = 2, Adopt = 3)
  int<lower = 1> K;  // number of population-level effects post hurdle
  matrix[N, K] X;  // population-level design matrix post hurdle
  
  // data for group-level effects of ID 1 = person
  int<lower = 1> N_1;  // number of grouping levels
  int<lower = 1> M_1;  // number of coefficients per level (random effects)
  int<lower = 1> J_1[N];  // grouping indicator per observation
  vector[N] Z_1_1;  // group 1 (person)-level predictor values
  
  // data for group-level effects of ID 2 = item
  int<lower = 1> N_2;  // number of grouping levels
  int<lower = 1> M_2;  // number of coefficients per level
  int<lower = 1> J_2[N];  // grouping indicator per observation
  vector[N] Z_2_1;  // group 2 (item)-level predictor values
  
  // data for group-level effects of ID 3 = study
  int<lower=1> N_3;  // number of grouping levels
  int<lower=1> M_3;  // number of coefficients per level
  int<lower=1> J_3[N];  // grouping indicator per observation
  vector[N] Z_3_1;  // group 3 (study)-level predictor values
  
  // data for outcomes
  vector[N] Y2;  // response variable
  vector[N] Y2_complete;  // untrimmed response variable (includes exact 0s and 1s)
  
  // priors
  vector[K - 1] b_prior;  // prior scales (SDs) for the slope coefficients
  int prior_only;  // should the likelihood be ignored?
  
  // other stuff
  int<lower = 1> N_test;  // number of observations in test data
  matrix[N_test, K - 1] X_test;  // test data
  
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  
  vector[Kc] b_eta1;  // population-level effects
  real Intercept_eta1;  // temporary intercept for centered predictors
  vector[Kc] b_eta2;  // population-level effects
  real Intercept_eta2;  // temporary intercept for centered predictors
  
  vector<lower = 0>[M_1] sd_1;  // person-level SDs (eta1)
  vector[N_1] z_1[M_1];  // standardized person-level effects (eta1)
  vector<lower = 0>[M_2] sd_2;  // item-level SDs (eta1)
  vector[N_2] z_2[M_2];  // standardized item-level effects (eta1)
  vector<lower = 0>[M_1] sd_3;  // person-level SDs (eta2)
  vector[N_1] z_3[M_1];  // standardized person-level effects (eta2)
  vector<lower = 0>[M_2] sd_4;  // item-level SDs (eta2)
  vector[N_2] z_4[M_2];  // standardized item-level effects (eta2)
  
  vector<lower = 0>[M_3] sd_5;  // study-level SDs (eta1)
  vector[N_3] z_5[M_3];  // standardized study-level effects (eta1)
  vector<lower = 0>[M_3] sd_6;  // study-level SDs (eta2)
  vector[N_3] z_6[M_3];  // standardized study-level effects (eta2)
  
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  vector<lower = 0>[M_1] sd_1a;  // person-level SDs (mu)
  vector[N_1] z_1a[M_1];  // standardized person-level effects (mu)
  vector<lower = 0>[M_2] sd_2a;  // item-level SDs (mu)
  vector[N_2] z_2a[M_2];  // standardized item-level effects (mu)
  vector<lower = 0>[M_1] sd_1b;  // person-level SDs (phi)
  vector[N_1] z_1b[M_1];  // standardized person-level effects (phi)
  vector<lower = 0>[M_2] sd_2b;  // item-level SDs (phi)
  vector[N_2] z_2b[M_2];  // standardized item-level effects (phi)
  
  vector<lower = 0>[M_3] sd_3a;  // study-level SDs (mu)
  vector[N_3] z_3a[M_3];  // standardized study-level effects (mu)
  vector<lower = 0>[M_3] sd_3b;  // study-level SDs (phi)
  vector[N_3] z_3b[M_3];  // standardized study-level effects (phi)

  real Intercept_phi;  // temporary intercept for centered distance var
  vector[Kc] b_dist;  // population-level effects
  
}
transformed parameters {
  vector[N_1] r_1_eta1_1;  // actual group-level effects
  vector[N_2] r_2_eta1_1;  // actual group-level effects
  vector[N_1] r_3_eta2_1;  // actual group-level effects
  vector[N_2] r_4_eta2_1;  // actual group-level effects
  
  vector[N_3] r_5_eta1_1;  // actual group-level effects
  vector[N_3] r_6_eta2_1;  // actual group-level effects
  
  vector[N_1] r_1_1;  // actual group-level effects
  vector[N_2] r_2_1;  // actual group-level effects
  vector[N_1] r_1_1b;  // actual group-level effects
  vector[N_2] r_2_1b;  // actual group-level effects
  
  vector[N_3] r_3_1;  // actual group-level effects
  vector[N_3] r_3_1b;  // actual group-level effects
  
  
  r_1_eta1_1 = (sd_1[1] * (z_1[1]));
  r_2_eta1_1 = (sd_2[1] * (z_2[1]));
  r_3_eta2_1 = (sd_3[1] * (z_3[1]));
  r_4_eta2_1 = (sd_4[1] * (z_4[1]));
  
  r_5_eta1_1 = (sd_5[1] * (z_5[1]));
  r_6_eta2_1 = (sd_6[1] * (z_6[1]));
  
  r_1_1 = (sd_1a[1] * (z_1a[1]));
  r_2_1 = (sd_2a[1] * (z_2a[1]));
  r_1_1b = (sd_1b[1] * (z_1b[1]));
  r_2_1b = (sd_2b[1] * (z_2b[1]));
  
  r_3_1 = (sd_3a[1] * (z_3a[1]));
  r_3_1b = (sd_3b[1] * (z_3b[1]));

}
model {
  
  // initialize linear predictor term
  vector[N] eta1 = Intercept_eta1 + Xc * b_eta1;
  // initialize linear predictor term
  vector[N] eta2 = Intercept_eta2 + Xc * b_eta2;
  // linear predictor matrix
  vector[ncat] eta[N];
  
  // beta post hurdle
  // initialize linear predictor term
  vector[N] mu = Intercept + Xc * b;
  vector[N] phi = Intercept_phi + Xc * b_dist;
  vector[N] A;                         // parameter for beta distn
  vector[N] B;                         // parameter for beta distn
  
  for (n in 1:N) {
    // add more terms to the linear predictor
    eta1[n] += r_1_eta1_1[J_1[n]] * Z_1_1[n] + r_2_eta1_1[J_2[n]] * Z_2_1[n] + r_5_eta1_1[J_3[n]] * Z_3_1[n];
    // add more terms to the linear predictor
    eta2[n] += r_3_eta2_1[J_1[n]] * Z_1_1[n] + r_4_eta2_1[J_2[n]] * Z_2_1[n] + r_6_eta2_1[J_3[n]] * Z_3_1[n];
    eta[n] = [0, eta1[n], eta2[n]]';
    // add more terms to the linear predictor
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_2_1[J_2[n]] * Z_2_1[n] + r_3_1[J_3[n]] * Z_3_1[n];
    phi[n] += r_1_1b[J_1[n]] * Z_1_1[n] + r_2_1b[J_2[n]] * Z_2_1[n] + r_3_1b[J_3[n]] * Z_3_1[n];
    A[n] = inv_logit(mu[n]) * exp(phi[n]);
    B[n] = (1 - inv_logit(mu[n])) * exp(phi[n]);
  }
  
  // priors including all constants
  target += normal_lpdf(b_eta1 | 0, b_prior);
  target += normal_lpdf(b_eta2 | 0, b_prior);
  target += normal_lpdf(Intercept_eta1 | 0, 5);
  target += normal_lpdf(Intercept_eta2 | 0, 5);
  target += student_t_lpdf(sd_1 | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_1[1]);
  target += student_t_lpdf(sd_2 | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_2[1]);
  target += student_t_lpdf(sd_3 | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3[1]);
  target += student_t_lpdf(sd_4 | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_4[1]);
  
  target += student_t_lpdf(sd_5 | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_5[1]);
  target += student_t_lpdf(sd_6 | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_6[1]);
  
  // priors including all constants
  target += normal_lpdf(b | 0, b_prior);
  target += normal_lpdf(Intercept | 0, 5);
  target += normal_lpdf(b_dist | 0, b_prior);
  target += normal_lpdf(Intercept_phi | 0, 5);
  target += student_t_lpdf(sd_1a | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_1a[1]);
  target += student_t_lpdf(sd_2a | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_2a[1]);
  target += student_t_lpdf(sd_1b | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_1b[1]);
  target += student_t_lpdf(sd_2b | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_2b[1]);
  
  target += student_t_lpdf(sd_3a | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3a[1]);
  target += student_t_lpdf(sd_3b | 3, 0, 2.5)
  - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += std_normal_lpdf(z_3b[1]);
  
  // likelihood including all constants
  if (!prior_only) {  
    for (n in 1:N) {
      Y1[n] ~ categorical_logit(eta[n]);
      // Compromise responses (Y1 == 1) also contribute the beta likelihood
      if (Y1[n] == 1) {
        Y2[n] ~ beta(A[n], B[n]);
      }
    }
  }
}


generated quantities {
  // actual population-level intercept
  real b_eta1_Intercept = Intercept_eta1 - dot_product(means_X, b_eta1);
  // actual population-level intercept
  real b_eta2_Intercept = Intercept_eta2 - dot_product(means_X, b_eta2);
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // actual population-level intercept
  real b_Intercept_phi = Intercept_phi - dot_product(means_X, b_dist);
  // initialize linear predictor term
  real eta1;
  // initialize linear predictor term
  real eta2;
  //real A[N];
  //real B[N];
  vector[N_test] eta1_t = b_eta1_Intercept + X_test * b_eta1;
  vector[N_test] eta2_t = b_eta2_Intercept + X_test * b_eta2;
  // initialize linear predictor term
  //testing results 
  real pc1_t[N_test];
  real pc2_t[N_test];
  real pc3_t[N_test];
  vector[ncat] eta;
  vector[N] log_lik;
  real y_rep[N];
  int y1_rep;  // simulated DAC category
  
  // initialize linear predictor term
  real mu;
  real phi;
  vector[N_test] mu_t = b_Intercept + X_test * b;
  vector[N_test] phi_t = exp(b_Intercept_phi + X_test * b_dist);
  real A_t[N_test];
  real B_t[N_test];
  
  for (n in 1:N) {
    // add more terms to the linear predictor
    eta1 = Intercept_eta1 + Xc[n] * b_eta1 + r_1_eta1_1[J_1[n]] * Z_1_1[n] + r_2_eta1_1[J_2[n]] * Z_2_1[n] + r_5_eta1_1[J_3[n]] * Z_3_1[n];
    // add more terms to the linear predictor
    eta2 = Intercept_eta2 + Xc[n] * b_eta2 + r_3_eta2_1[J_1[n]] * Z_1_1[n] + r_4_eta2_1[J_2[n]] * Z_2_1[n] + r_6_eta2_1[J_3[n]] * Z_3_1[n];
    eta = [0, eta1, eta2]';  
    // add more terms to the linear predictor
    mu = Intercept + Xc[n] * b + r_1_1[J_1[n]] * Z_1_1[n] + r_2_1[J_2[n]] * Z_2_1[n] + r_3_1[J_3[n]] * Z_3_1[n];
    phi = Intercept_phi + Xc[n] * b_dist + r_1_1b[J_1[n]] * Z_1_1[n] + r_2_1b[J_2[n]] * Z_2_1[n] + r_3_1b[J_3[n]] * Z_3_1[n];
    y1_rep = categorical_logit_rng(eta);
    if (y1_rep == 2){
      y_rep[n] = 0;
    } 
    else if (y1_rep == 3){
      y_rep[n] = 1;
    }
    else {
      y_rep[n] = beta_rng(inv_logit(mu) * exp(phi), (1 - inv_logit(mu)) * exp(phi));
    }
    if (Y1[n] == 2){
      log_lik[n] = categorical_logit_lpmf(2 | eta);
    }
    else if (Y1[n] == 3){
      log_lik[n] = categorical_logit_lpmf(3 | eta);
    }
    else {
      log_lik[n] = categorical_logit_lpmf(1 | eta);
      log_lik[n] += beta_lpdf(Y2[n] | inv_logit(mu) * exp(phi), (1 - inv_logit(mu)) * exp(phi));
    }
    
  }
  for (n2 in 1:N_test){
    pc2_t[n2] = exp(eta1_t[n2])/(1 + exp(eta1_t[n2]) + exp(eta2_t[n2]));
    pc3_t[n2] = exp(eta2_t[n2])/(1 + exp(eta1_t[n2]) + exp(eta2_t[n2]));
    pc1_t[n2] = 1 - pc2_t[n2] - pc3_t[n2];
    A_t[n2] = inv_logit(mu_t[n2]) * phi_t[n2];
    B_t[n2] = (1 - inv_logit(mu_t[n2])) * phi_t[n2];
  }
}
R Code Model Setup
here::here()

library(rstan)
library(bridgesampling)
library(loo)
library(bayesplot)
library(bayestestR)
library(tidyverse)

options(scipen = 999)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = T)

set.seed(123)

dat_full <- read.csv(here::here("woa_datasets.csv"))

studynames <- dat_full %>%
  select(!c(gender, published, base, female_percentage,
            mean_age, student_judge, almanac, future,
            incentive, abs_value, winsor)) %>%
  mutate(distance = abs(firstestimate - advice) / firstestimate) %>%
  filter(!is.na(woa_raw) & !is.na(zconfidence) & distance < 1) %>%
  select(study) %>% 
  summarize(.by = study,
            studyname = first(study),
            study = as.character(cur_group_id()))

dat <- dat_full %>%
  select(!c(gender, published, base, female_percentage,
            mean_age, student_judge, almanac, future,
            incentive, abs_value, winsor)) %>%
  mutate(distance = abs(firstestimate - advice) / firstestimate) %>%
  filter(!is.na(woa_raw) & !is.na(zconfidence) & distance < 1) %>%
  mutate(.by = study,
         study = cur_group_id()) %>%
  mutate(.by = id,
         id = cur_group_id()) %>%
  mutate(.by = c(study, trial),
         trial = cur_group_id())

dat %>% summary()


## Create trimmed version of WOA (probably unnecessary: exact 0s and 1s are routed to the hurdle, not the beta)
dat$woa_winsor_trim <- dat$woa_winsor
dat$woa_winsor_trim[dat$woa_winsor == 1] <- .999
dat$woa_winsor_trim[dat$woa_winsor == 0] <- .001


## Subset predictors and create interaction terms
x <- dat %>%
  mutate(intercept = 1,
         distance = distance,
         confidence = zconfidence,
         distance_confidence = distance * confidence) %>%
  select(intercept, distance, confidence, distance_confidence)

# ,
# randomadvice = random_advice,
# expertadvisor = expert_advisor,
# random_expert = randomadvice * expertadvisor
# ,
# randomadvice, expertadvisor, random_expert

## Create test/toy data

testdat <- expand.grid(distance = c(mean(x$distance) - sd(x$distance),
                                    mean(x$distance),
                                    mean(x$distance) + sd(x$distance)),
                       confidence = c(mean(x$confidence) - sd(x$confidence),
                                      mean(x$confidence),
                                      mean(x$confidence) + sd(x$confidence))) %>%
  mutate(distance_confidence = distance * confidence)

# ,
# randomadvice = c(0, 1),
# expertadvisor = c(0, 1)
# ,
# random_expert = randomadvice * expertadvisor

condition_names <- expand.grid(distance = c("DL", "DM", "DH"),
                               confidence = c("CL", "CM", "CH")) %>%
  mutate(condition = paste0(distance, confidence)) %>%
  pull(condition)

# ,
# randomadvice = c("NR", "YR"),
# expertadvisor = c("NE", "YE")
# ,
# lvl3_condition = paste0(randomadvice, confidence)

## Scale priors

pr_v <- rep(NA, ncol(x) - 1) # prior scales for the slopes; the intercept priors are set in the Stan code

for (i in 1:length(pr_v)){
  sdi <- sd(x[, i + 1], na.rm = TRUE)
  if (length(levels(factor(x[, i + 1]))) == 2) { # binary predictors keep the base scale
    pr_v[i] <- 3
  }
  else {
    pr_v[i] <- 3 / sdi # scale by 1/SD so the prior is N(0, 3) on the standardized predictor
  }
}


## Create DAC Variable

dat$DAC <- ifelse(dat$woa_winsor == 0, 2, 
                  ifelse(dat$woa_winsor == 1, 3, 
                         1))

k <- ncol(x)

## Run model

helmer_stan <- stan("DAC Helmer 2.stan",
                    data = list(N = nrow(dat), # number observations
                                ncat = 3, #number of DAC categories
                                Y1 = dat$DAC, # DAC variable
                                K = k, #number of predictors + 1 for intercept
                                X = x, #predictor matrix
                                N_1 = length(unique(dat$id)), # number of unique participants
                                M_1 = 1, # number of random effects for person, just intercepts here
                                J_1 = dat$id, # participant ids
                                N_2 = length(unique(dat$trial)), # number of unique items
                                M_2 = 1, # number of random effects for items
                                J_2 = dat$trial, # item ids
                                
                                N_3 = length(unique(dat$study)), # number of unique studies
                                M_3 = 1, # number of random effects for studies
                                J_3 = dat$study, # study ids
                                
                                Y2 = dat$woa_winsor_trim, #continuous response variable for beta regression
                                Y2_complete = dat$woa_winsor, #untrimmed version
                                Z_1_1 = rep(1, nrow(dat)), # random effect for person (just 1s for intercept)
                                Z_2_1 = rep(1, nrow(dat)),  # random effect for item (just 1s for intercept)
                                
                                Z_3_1 = rep(1, nrow(dat)),  # random effect for study (just 1s for intercept)
                                
                                b_prior = pr_v, # scaled priors for coefficients
                                prior_only = 0, # draw from prior only and ignore likelihood?
                                N_test = nrow(testdat), # number of observations in test data
                                X_test = testdat), # test data
                    warmup = 1000, iter = 2500,
                    seed = 50401, init_r = .2)

saveRDS(helmer_stan, file = here::here("helmer_stan_2_4.rds"))
R Code Model Results
helmer_stan <- readRDS(here::here("helmer_stan_2_4.rds"))

traceplot(helmer_stan)

list_of_draws <- rstan::extract(helmer_stan, pars = "y_rep")
ppcplt <- ppc_dens_overlay(y = dat$woa_winsor_trim, yrep = list_of_draws$y_rep[1:50, ]) +
  theme(legend.position = "bottom",
        text = element_text(family = "Roboto"),
        plot.margin = margin(2, 10, 2, 10),
        legend.margin = margin(-10, 2, 2, 2))

saveRDS(ppcplt, here::here("Figures", "ppc_plt.rds"))

topfifs_plt <- dat %>%
  mutate(study = factor(study)) %>%
  left_join(studynames,
            by = "study") %>%
  mutate(fif = ifelse(woa_winsor_trim == .5, 1, 0)) %>%
  summarize(.by = studyname,
            fifprop = sum(fif) / n()) %>%
  filter(fifprop > .08) %>%
  arrange(desc(fifprop)) %>%
  ggplot(aes(y = fct_reorder(studyname, fifprop), x = fifprop)) +
  geom_col(fill = "#ff926d", alpha = .8) +
  scale_x_continuous("Proportion of .5s", labels = scales::percent_format()) +
  labs(y = NULL) +
  theme_minimal(base_size = 14) +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major.x = element_line(linewidth = 1, linetype = "dashed"),
        panel.grid.major.y = element_blank())

dat %>%
  mutate(study = factor(study)) %>%
  left_join(studynames,
            by = "study") %>%
  mutate(fif = ifelse(woa_winsor_trim == .5, 1, 0)) %>%
  summarize(.by = studyname,
            fifprop = sum(fif) / n()) %>%
  saveRDS(file = here::here("Data Clean", "topfifs_dat.rds"))

saveRDS(topfifs_plt, here::here("Figures", "topfifs_plt.rds"))

dat %>%
  mutate(study = factor(study)) %>%
  left_join(studynames,
            by = "study") %>%
  mutate(fif = ifelse(woa_winsor_trim == .5, 1, 0)) %>%
  mutate(.by = studyname,
         fifprop = sum(fif) / n()) %>%
  ggplot(aes(x = woa_winsor_trim)) +
  geom_density(fill = "#bc5090", color = NA) +
  geom_histogram(aes(y = after_stat(density)),
                 bins = 9, alpha = .3,
                 color = "#6a5188", fill = "white",
                 linewidth = 1) +
  facet_wrap(~ fct_reorder(studyname, -fifprop)) +
  scale_x_continuous("WOA", breaks = c(0, .5, 1)) +
  cowplot::theme_cowplot()


extr <- function(var) {
  helmer_stan %>% 
    rstan::extract(var) %>%
    pluck(var) %>%
    data.frame() %>%
    select(1:length(condition_names)) %>%
    rename_all(~ condition_names) %>%
    pivot_longer(everything(),
                 names_to = "condition", values_to = substr(tolower(var), 1, nchar(var) - 2))
}

playdat <- cbind(extr("A_t"),
                 select(extr("B_t"), b),
                 select(extr("pc1_t"), pc1),
                 select(extr("pc2_t"), pc2),
                 select(extr("pc3_t"), pc3)) %>%
  mutate(distance = case_when(substr(condition, 2, 2) == "L" ~ testdat[1, "distance"],
                              substr(condition, 2, 2) == "M" ~ testdat[2, "distance"],
                              substr(condition, 2, 2) == "H" ~ testdat[3, "distance"]),
         confidence = case_when(substr(condition, 4, 4) == "L" ~ testdat[1, "confidence"],
                                substr(condition, 4, 4) == "M" ~ testdat[4, "confidence"],
                                substr(condition, 4, 4) == "H" ~ testdat[7, "confidence"]))


playdat %>%
  pivot_longer(c("pc1", "pc2", "pc3"),
               names_to = "choice", values_to = "prob") %>%
  mutate(choice = factor(choice,
                         levels = c("pc2", "pc3", "pc1")),
         distance = factor(round(distance, 2), ordered = T),
         confidence = factor(round(confidence, 2), ordered = T)) %>% 
  filter(choice == "pc1") %>%
  ggplot(aes(y = prob, x = confidence, fill = distance)) +
  geom_violin(color = NA, width = .8, alpha = .8) + 
  stat_summary(aes(x = confidence, group = distance),
               fun = "mean",
               geom = "point",
               color = "white",
               position = position_dodge(.8)) +
  scale_x_discrete("confidence") +
  scale_y_continuous("P(Compromise)",
                     limits = c(0, 1)) +
  scale_fill_manual(values = c("#ffd6ed", "#df94be", "#bc5090")) +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "bottom")


test_rbeta_gen <- matrix(NA, ncol = 1, nrow = nrow(playdat))

set.seed(203001)
for (i in 1:nrow(playdat)){
  for(j in 1:ncol(test_rbeta_gen)){
    p <- rmultinom(1, 1, c(playdat$pc1[i], playdat$pc2[i], playdat$pc3[i]))
    if (p[2] == 1){
      test_rbeta_gen[i, j] <- 0
    }
    else if (p[3] == 1){
      test_rbeta_gen[i, j] <- 1
    }
    else{
      test_rbeta_gen[i, j] <- rbeta(1, playdat$a[i], playdat$b[i])
    }
  }
}

test_rbeta <- data.frame(test_rbeta_gen) %>%
  cbind(select(playdat, condition, distance, confidence)) %>%
  select(condition, test_rbeta_gen, distance, confidence) %>%
  rename(dens = test_rbeta_gen)

test_rbeta_mean <- test_rbeta %>%
  filter(dens > 0 & dens < 1) %>%
  summarise(.by = c(distance, confidence),
            dens = mean(dens)) %>%
  mutate(distance = factor(round(distance, 2), ordered = T),
         confidence = factor(round(confidence, 2), ordered = T))


test_rbeta %>%
  mutate(distance = factor(round(distance, 2), ordered = T),
         confidence = factor(round(confidence, 2), ordered = T)) %>% 
  ggplot(aes(x = confidence, y = dens, fill = distance)) + 
  geom_hline(yintercept = .5, color = "gray60") + 
  geom_violin(width = .5, color = NA, alpha = .8,
              position = position_dodge(.9)) +  
  geom_point(data = test_rbeta_mean,
             aes(x = confidence, y = dens),
             position = position_dodge(.9),
             size = 2) +
  stat_summary(aes(x = confidence, group = distance),
               fun = "mean",
               geom = "point",
               color = "gray40",
               shape = 21,
               fill = "white",
               position = position_dodge(.9)) +
  labs(fill = "distance", x = "confidence", y = "WOA") +
  scale_fill_manual(values = c("#ffd6ed", "#df94be", "#bc5090")) +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "bottom")


inits_dat <- data.frame(par = 1:nrow(data.frame(summary(helmer_stan, pars = "r_1_eta1_1")$summary)),
                        eta1 = data.frame(summary(helmer_stan, pars = "r_1_eta1_1")$summary)$mean +
                          data.frame(summary(helmer_stan, pars = "Intercept_eta1")$summary)$mean,
                        eta2 = data.frame(summary(helmer_stan, pars = "r_3_eta2_1")$summary)$mean +
                          data.frame(summary(helmer_stan, pars = "Intercept_eta2")$summary)$mean) %>% 
  mutate(Decline = exp(eta1)/(1 + exp(eta1) + exp(eta2)),
         Adopt = exp(eta2)/(1 + exp(eta1) + exp(eta2)),
         Compromise = 1/(1 + exp(eta1) + exp(eta2))) %>% 
  select(par, Decline, Adopt, Compromise) %>%
  pivot_longer(c("Decline", "Adopt", "Compromise"),
               names_to = "Parameter", values_to = "P") %>%
  mutate(Parameter = factor(Parameter, levels = c("Decline", "Adopt", "Compromise")))


ggplot(inits_dat, aes(x = Parameter, y = round(P, 2), fill = Parameter)) + 
  geom_violin(adjust = 1, color = NA, alpha = .8) + 
  theme_minimal() +
  xlab("Intercept") +
  ylab("Probability") +
  guides(fill = "none") +
  scale_fill_manual(values = c("#003f5c", "#bc5090", "#ffa600")) +
  coord_cartesian(ylim = c(0, 1), expand = 0) +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "bottom")

helmer_stan %>%
  rstan::extract(pars = c("r_5_eta1_1", "r_6_eta2_1")) %>%
  as.data.frame() %>% 
  pivot_longer(everything(),
               names_to = "study", values_to = "est") %>%
  separate_wider_delim(study, ".", names = c("par", "study")) %>%
  filter(study %in% 1:10) %>%
  left_join(studynames,
            by = "study") %>%
  mutate(odds = exp(est),
         prob = odds / (1 + odds)) %>%
  ggplot(aes(x = prob, y = studyname, color = par, fill = par)) +
  geom_boxplot(width = .25, linewidth = .8, outliers = F, position = position_dodge(.5)) +
  stat_summary(geom = "point", fun = mean, color = "white", position = position_dodge(.5), size = 1.5, shape = 16) +
  scale_y_discrete(NULL) +
  scale_x_continuous("estimate", breaks = c(0, .25, .5, .75, 1),
                     labels = scales::percent_format()) +
  scale_color_manual(NULL, values = c("#6a5188", "#c77998")) +
  scale_fill_manual(NULL, values = c("#6a5188", "#c77998")) +
  coord_cartesian(xlim = c(0, 1)) +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        legend.position = "bottom")

p_DAC_forestplt <- helmer_stan %>%
  rstan::extract(pars = c("r_5_eta1_1", "r_6_eta2_1", "Intercept_eta1", "Intercept_eta2")) %>%
  as.data.frame() %>% 
  pivot_longer(-c(Intercept_eta1, Intercept_eta2),
               names_to = "study", values_to = "est") %>% 
  separate_wider_delim(study, ".", names = c("par", "study")) %>%
  mutate(.by = c(study, par),
         rep = row_number()) %>%
  pivot_wider(id_cols = c("study", "rep", "Intercept_eta1", "Intercept_eta2"),
              names_from = par, values_from = est) %>%
  mutate(eta1 = r_5_eta1_1 + Intercept_eta1,
         eta2 = r_6_eta2_1 + Intercept_eta2,
         .keep = "unused") %>%
  left_join(studynames,
            by = "study") %>%
  mutate(Decline = exp(eta1) / (1 + exp(eta1) + exp(eta2)),
         Adopt = exp(eta2) / (1 + exp(eta1) + exp(eta2)),
         Compromise = 1/(1 + exp(eta1) + exp(eta2))) %>% 
  pivot_longer(c(Decline, Adopt, Compromise),
               names_to = "par", values_to = "prob") %>%
  mutate(Parameter = factor(par, levels = c("Decline", "Adopt", "Compromise"))) %>%
  {ggplot(., aes(x = prob, y = studyname, color = fct_inorder(par), fill = fct_inorder(par))) +
      geom_boxplot(width = .3, linewidth = 1.2, outliers = F, position = position_dodge(.5)) +
      stat_summary(geom = "point", fun = mean,
                   color = rep(c("#80A4C2", "#E0BAD7", "#FFE3B3"), times = 147 / 3),
                   position = position_dodge(.5), size = 1.5, shape = 16) +
      scale_y_discrete(NULL) +
      scale_x_continuous(NULL, breaks = c(0, .25, .5, .75, 1),
                         labels = scales::percent_format()) +
      scale_color_manual(NULL, values = c("#003f5c", "#bc5090", "#ffa600")) +
      scale_fill_manual(NULL, values = c("#003f5c", "#bc5090", "#ffa600")) +
      coord_cartesian(xlim = c(0, 1)) +
      theme_minimal() +
      theme(panel.grid.minor = element_blank(),
            legend.position = "none")}

p_DAC_forestplt_legend <- (data.frame(par = factor(c(1, 2, 3), labels = c("Decline", "Adopt", "Compromise")),
                                      y = c(1, 2, 3)) %>%
                             ggplot(aes(x = par, y = y, color = par)) +
                             geom_point() +
                             scale_color_manual(NULL, values = c("#003f5c", "#bc5090", "#ffa600")) +
                             guides(color = guide_legend(override.aes = list(size = 4))) +
                             theme_minimal(base_size = 15) +
                             theme(legend.position = "top",
                                   legend.margin = margin(t = -10, unit = "pt")) )%>%
  cowplot::get_plot_component('guide-box-top', return_all = TRUE)

saveRDS(p_DAC_forestplt, here::here("Figures", "p_DAC_forestplt.rds"))
saveRDS(p_DAC_forestplt_legend, here::here("Figures", "p_DAC_forestplt_legend.rds"))


# study level intercepts 
stdylvl_dat <- data.frame(par = 1:nrow(data.frame(summary(helmer_stan, pars = "r_5_eta1_1")$summary)),
                          eta1 = data.frame(summary(helmer_stan, pars = "r_5_eta1_1")$summary)$mean +
                            data.frame(summary(helmer_stan, pars = "Intercept_eta1")$summary)$mean,
                          eta2 = data.frame(summary(helmer_stan, pars = "r_6_eta2_1")$summary)$mean +
                            data.frame(summary(helmer_stan, pars = "Intercept_eta2")$summary)$mean) %>% 
  mutate(Decline = exp(eta1)/(1 + exp(eta1) + exp(eta2)),
         Adopt = exp(eta2)/(1 + exp(eta1) + exp(eta2)),
         Compromise = 1/(1 + exp(eta1) + exp(eta2))) %>% 
  select(par, Decline, Adopt, Compromise) %>%
  pivot_longer(c("Decline", "Adopt", "Compromise"),
               names_to = "Parameter", values_to = "P") %>%
  mutate(Parameter = factor(Parameter, levels = c("Decline", "Adopt", "Compromise")))


studylvl_plt <- ggplot(stdylvl_dat, aes(x = Parameter, y = round(P, 2), fill = Parameter, color = Parameter)) + 
  geom_violin(color = NA, alpha = .6) + 
  geom_jitter(shape = 16, alpha = .8, width = .2) +
  theme_minimal() +
  xlab("Intercept") +
  ylab("Probability") +
  guides(fill = "none") +
  scale_fill_manual(values = c("#003f5c", "#bc5090", "#ffa600")) +
  scale_color_manual(values = c("#003f5c", "#bc5090", "#ffa600")) +
  coord_cartesian(ylim = c(0, 1), expand = 0) +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "none")

saveRDS(studylvl_plt, here::here("Figures", "studylvl_plt.rds"))

Results

Displayed below is a plot of the distribution of WOA values in the data alongside the model's posterior predictions. A large proportion of WOA values fall near zero and one, with a range of values in between, as the DAC framework would expect.

One noticeable feature is the bump at exactly .5, with lower density surrounding it. This appears to be a study-level phenomenon: some studies have many responses with WOA values of exactly .5. Below are the studies with the highest proportions of .5 responses.

The average proportion of .5s in any given study was 5.24%.

It may be helpful to identify why certain studies have more responses of exactly .5 than others; the model fits the data well across all values except the center.

Below is the distribution of the study-level intercepts for the probability of responses falling into each DAC category. To the right, box plots show the distribution of the sampled intercepts for each study. There is considerable variability in these distributions between studies.

The next steps are to decide how to handle these high-.5 studies and to add predictors to the model. I am looking forward to hearing which predictors would be best to include in this analysis; if there are other ways of analyzing or presenting these results that would be helpful, please let me know!